Introduction to ZooKeeper
Learn why ZooKeeper was built and the needs it fulfills.
Need for ZooKeeper#
In distributed systems, many processes run on different nodes and need to coordinate with each other to complete a task. Coordination takes different forms, for example, checking whether a process is alive and what it is responsible for, resource sharing, group membership, leader election, and configuration management. Some systems use locking (for example, 2PL and Chubby) to enable resource sharing (a form of coordination) between processes. Other services, like Amazon SQS and Akamai’s configuration management system, are coordination services built for one specific need (queuing and configuration management, respectively).
Note: Locking is one of the best-known and most powerful coordination primitives for avoiding race conditions on shared resources.
ZooKeeper is also a coordination system, so why was it developed when systems like 2PL and Chubby already existed? Let’s look at the limitations of the earlier coordination systems to understand the importance of ZooKeeper.
Problems with old coordination systems#
- Old coordination systems were specialized for different coordination needs:
  - Chubby implements a locking primitive and provides strong synchronization guarantees.
  - Amazon Simple Queue Service (Amazon SQS) is a coordination service specialized for queuing.
  - A robust and lightweight stable leader election service is specialized for leader election.
  - Akamai’s configuration management system is specialized for configuration management.
- The old coordination systems use blocking primitives like locks, so slow or faulty clients can degrade the performance of fast clients. Making request processing depend on detecting the failure of other clients would work around this, but it increases the complexity of implementing such a service.
Note: We need a system similar to Chubby but without the locking service.
The novelty in ZooKeeper#
The main contributions of ZooKeeper are as follows:
- Custom coordination service: Instead of limiting application developers to a predefined collection of coordination primitives/services, it enables the developers to create their own coordination primitives based on their application needs.
- Wait-free service: It moves away from blocking primitives like locks, which provide strong consistency guarantees but let slow or faulty clients degrade the performance of fast clients. Instead, it proposes a wait-free service with relaxed consistency guarantees.
The coordination kernel refers to the wait-free service with relaxed consistency guarantees that ZooKeeper offers. Using this kernel, applications can implement different coordination primitives without changing the underlying service core.
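To make the idea of building a custom primitive on top of a wait-free kernel concrete, here is a minimal sketch. It assumes a toy in-memory stand-in called `ToyKernel` (a hypothetical name, not ZooKeeper's client library) whose `create()` returns immediately with success or failure. The `TryLock` class is likewise hypothetical and lives entirely on the client side.

```python
class ToyKernel:
    """Hypothetical in-memory stand-in for a coordination kernel (not ZooKeeper's client)."""

    def __init__(self):
        self._nodes = {}  # path -> data

    def create(self, path, data=b""):
        # Returns immediately: True if created, False if the path already exists.
        if path in self._nodes:
            return False
        self._nodes[path] = data
        return True

    def delete(self, path):
        # True if a node was removed, False if there was nothing to remove.
        return self._nodes.pop(path, None) is not None

    def exists(self, path):
        return path in self._nodes


class TryLock:
    """A client-side primitive built only from kernel calls; it never blocks."""

    def __init__(self, kernel, path, owner):
        self.kernel, self.path, self.owner = kernel, path, owner

    def try_acquire(self):
        return self.kernel.create(self.path, self.owner.encode())

    def release(self):
        return self.kernel.delete(self.path)


kernel = ToyKernel()
a = TryLock(kernel, "/app/lock", "client-a")
b = TryLock(kernel, "/app/lock", "client-b")

first = a.try_acquire()   # client-a creates the node
second = b.try_acquire()  # client-b gets an immediate "no" instead of waiting
a.release()
third = b.try_acquire()   # the node is gone, so client-b succeeds
print(first, second, third)
```

The kernel itself knows nothing about locks: the lock semantics come from how the client interprets a successful or failed `create()`. This toy version does not check ownership on release or handle client crashes, which a production primitive would need to address.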
Functional requirements#
The functional requirement for ZooKeeper is to coordinate processes. As discussed above, there are different kinds of coordination, so instead of giving users a fixed set of coordination primitives, we want to provide developers with an API through which they can create and use their own primitives according to their application requirements. Therefore, broadly, the functional requirement for ZooKeeper is as follows:
- Design and implement the client API: To let clients (application developers) create custom coordination primitives.
Non-functional requirements#
The non-functional requirements for ZooKeeper are as follows:
- Good performance: The system should be highly available so that each client request is served without waiting.
  - ZooKeeper should be able to distribute the load across the servers in such a way that the system can achieve high throughput.
  - ZooKeeper should be able to reduce the latency of requests made through the client API.
- Simple design: ZooKeeper’s design and implementation should be simple, giving users more room to build on the service.
ZooKeeper#
ZooKeeper is a system that helps application developers build coordination services through its client API. ZooKeeper's client API combines components from distributed lock services, shared registers, and group messaging into a replicated centralized coordination service.
Let's understand ZooKeeper’s high-level design.
High-level design#
ZooKeeper’s high-level design is shown in the following illustration. It mainly consists of the clients, the client API, and the ZooKeeper server.
- ZooKeeper clients: The clients are the applications that use ZooKeeper as a coordination service for their application processes.
- ZooKeeper client API: It is part of the ZooKeeper client library. The API provides functions such as create(), delete(), exists(), and many more to manage and use the coordination data. Through this API, the client request is forwarded to the ZooKeeper server. The detailed API functions are discussed in the next lesson.
Note: The ZooKeeper API resembles the API of file systems. It also resembles the Chubby API without the lock, open, and close methods.
- ZooKeeper server: The server is a process that provides the ZooKeeper coordination service. It stores all the coordination data from different applications and their processes in memory. The namespace for applications/clients and their coordination data is organized as a hierarchy (in the form of a tree). The client application processes store their coordination data on znodes and can perform all the operations provided in the ZooKeeper client API. Each znode is accessed through its path in standard UNIX notation (with / for the root directory). A single server is shown in the illustration above, but in practice there is a set of ZooKeeper servers called the ZooKeeper ensemble. All servers are replicas: one is elected as the leader, and the others become followers. The details of the leader and follower servers are covered in the next lesson.
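The hierarchical namespace described above can be pictured with a small in-memory model. This is only a sketch: `ZNode` and `ToyZNodeTree` are hypothetical names, the method signatures are simplified, and the rules that a parent must exist before a child is created and that only childless nodes can be deleted are modeling assumptions made here for illustration.

```python
class ZNode:
    def __init__(self, data=b""):
        self.data = data
        self.children = {}  # child name -> ZNode


class ToyZNodeTree:
    """Hypothetical in-memory model of a hierarchical namespace; not the real server."""

    def __init__(self):
        self.root = ZNode()  # the node at path "/"

    @staticmethod
    def _check(path):
        if not path.startswith("/"):
            raise ValueError("paths must be absolute, e.g. /app1/config")

    def _walk(self, path):
        # Return the node at `path`, or None if any component is missing.
        self._check(path)
        node = self.root
        for name in filter(None, path.split("/")):
            node = node.children.get(name)
            if node is None:
                return None
        return node

    def _parent_and_name(self, path):
        self._check(path)
        parent_path, _, name = path.rpartition("/")
        return self._walk(parent_path or "/"), name

    def exists(self, path):
        return self._walk(path) is not None

    def create(self, path, data=b""):
        parent, name = self._parent_and_name(path)
        if parent is None or not name or name in parent.children:
            return False  # missing parent, empty name, or node already there
        parent.children[name] = ZNode(data)
        return True

    def delete(self, path):
        parent, name = self._parent_and_name(path)
        if parent is None or name not in parent.children:
            return False
        if parent.children[name].children:
            return False  # assumption: only childless nodes can be deleted
        del parent.children[name]
        return True


tree = ToyZNodeTree()
tree.create("/app1")
tree.create("/app1/config", b"timeout=30")
results = [
    tree.exists("/app1/config"),  # the node was just created
    tree.create("/app2/worker"),  # fails: parent /app2 does not exist
    tree.delete("/app1"),         # fails: /app1 still has a child
    tree.delete("/app1/config"),
    tree.delete("/app1"),         # succeeds now that /app1 is empty
]
print(results)
```

Each application gets its own subtree (here `/app1`), so different applications can keep their coordination data apart while sharing one service.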
Design choices#
Along with wait-free data objects, the other ZooKeeper design decisions that enable the implementation of a high-performance (hundreds of thousands of transactions per second) processing pipeline are as follows:
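- We need first-in, first-out (FIFO) execution of each client’s requests.
- We require linearizability for client requests that alter the state of ZooKeeper. Together with FIFO client ordering, linearizable writes allow an efficient implementation of our coordination service and are sufficient to implement the coordination primitives that applications need. They are also enough to implement consensus, which makes ZooKeeper a universal object.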
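The two ordering guarantees above can be illustrated with a toy model. This is a sketch under strong assumptions: `ToySequencer` is a hypothetical single-process class that applies writes one at a time in a round-robin over per-client queues. It does not reflect how ZooKeeper's servers actually order or replicate updates; it only shows what the guarantees look like from the outside.

```python
from collections import deque
from itertools import count


class ToySequencer:
    """Hypothetical single-writer model: one total order for all state changes."""

    def __init__(self):
        self.state = {}
        self.log = []              # (sequence number, client, key, value)
        self._queues = {}          # client -> FIFO queue of pending writes
        self._next_seq = count(1)

    def submit(self, client, key, value):
        # A client may queue several writes without waiting for replies.
        self._queues.setdefault(client, deque()).append((key, value))

    def run(self):
        # Round-robin over clients, always taking each client's oldest pending
        # write, so per-client order is kept while clients interleave.
        while any(self._queues.values()):
            for client, queue in self._queues.items():
                if queue:
                    key, value = queue.popleft()
                    self.state[key] = value
                    self.log.append((next(self._next_seq), client, key, value))


sequencer = ToySequencer()
sequencer.submit("A", "/config", "v1")
sequencer.submit("A", "/ready", "yes")
sequencer.submit("B", "/config", "v2")
sequencer.run()

order_for_a = [key for _, client, key, _ in sequencer.log if client == "A"]
sequence_numbers = [entry[0] for entry in sequencer.log]
print(order_for_a, sequence_numbers, sequencer.state)
```

Client A's writes appear in the log in the order A submitted them (FIFO client order), and every write gets a unique position in a single sequence (one total order of writes), even though writes from A and B interleave.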
Bird’s eye view#
In the next lessons, we’ll design and evaluate ZooKeeper. The following concept map is a quick summary of the problem ZooKeeper solves and its novelties.
In this lesson, we learned the basics of ZooKeeper. In the next lesson, we’ll look at its components in detail.